Expression Rematerialization for VLIW DSP Processors with Distributed Register Files

Authors

  • Chung-Ju Wu
  • Chia-Han Lu
Abstract

Spill code is the memory load/store overhead incurred when the available registers are not sufficient to hold all live ranges during register allocation. Previous work has focused on reducing spill code for unified register files. To reduce power and cost, VLIW DSP processor designs are adopting distributed register files and multi-bank register architectures, which reduce the number of read/write ports between functional units and registers. This presents new challenges for devising compiler optimization schemes for such architectures. This paper addresses the reduction of spill code via rematerialization for a VLIW DSP processor with distributed register files. Rematerialization is a strategy by which the register allocator determines whether it is cheaper to recompute a value than to store it to memory and load it back. We propose a solution that exploits the characteristics of distributed register files, which offer the chance to balance or split live ranges. By heuristically estimating the register pressure of each register file, we treat the other register files as optional spill locations rather than always spilling to memory. The choice of spill location may preserve an expression result and keep the value alive in a different register file, which increases the opportunities for expression rematerialization and in turn reduces spill code. Experiments were conducted for the PAC VLIW DSP processor using the Open64 compiler infrastructure. Early experimental results show that our approach reduces memory access operations, owing to well-partitioned live ranges and well-rematerialized expression values.
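
The spill-location choice described in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm: the cost constants, the `RegisterFile` class, and the `choose_spill_location` function are hypothetical names and values chosen only to show how a register allocator might compare rematerialization, spilling to a less-pressured register file, and spilling to memory.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class RegisterFile:
    """Hypothetical model of one register file in a distributed design."""
    name: str
    capacity: int  # number of registers in this file (assumed > 0)
    pressure: int  # heuristic estimate of simultaneously live values

    def has_room(self) -> bool:
        return self.pressure < self.capacity


# Assumed relative costs in abstract "cycles"; not taken from the paper.
COST_MEMORY_SPILL = 6     # one store plus one reload
COST_INTER_FILE_COPY = 2  # copy out to another file plus copy back


def choose_spill_location(
    recompute_cost: Optional[int],
    operands_available: bool,
    other_files: List[RegisterFile],
) -> Tuple[str, Optional[str], int]:
    """Return (strategy, target_file_name, cost) for a value that must be evicted.

    recompute_cost is None when the value cannot be recomputed at all;
    operands_available says whether its operands are still live at the use.
    """
    # Spilling to memory is always possible, so it is the baseline candidate.
    candidates = [("memory", None, COST_MEMORY_SPILL)]

    # Rematerialization needs a recomputable expression with live operands.
    if recompute_cost is not None and operands_available:
        candidates.append(("rematerialize", None, recompute_cost))

    # Another register file is a candidate only if it has a free register;
    # prefer the file with the lowest relative pressure.
    roomy = [rf for rf in other_files if rf.has_room()]
    if roomy:
        target = min(roomy, key=lambda rf: rf.pressure / rf.capacity)
        candidates.append(("other_file", target.name, COST_INTER_FILE_COPY))

    # Pick the cheapest candidate; ties keep the earlier entry.
    return min(candidates, key=lambda c: c[2])


if __name__ == "__main__":
    files = [RegisterFile("A", 8, 8), RegisterFile("D", 8, 3)]
    print(choose_spill_location(None, False, files))
    print(choose_spill_location(1, True, files))
```

Under these assumed costs, a value that cannot be recomputed is parked in the less-pressured file instead of memory, while a cheap expression whose operands are still live is simply recomputed. A real allocator would derive the costs from the target's instruction latencies and from the placement of each register file relative to the functional units.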

Similar resources

Copy Propagation Optimizations for VLIW DSP Processors with Distributed Register Files

High-performance and low-power VLIW DSP processors are increasingly deployed on embedded devices to process video and multimedia applications. To reduce power and cost in VLIW DSP processor designs, distributed register files and multi-bank register architectures are being adopted to reduce the number of read/write ports in register files. This presents new challenges for devising com...

ORC2DSP: Compiler Infrastructure Supports for VLIW DSP Processors

In this paper, we describe our experiences in deploying ORC infrastructures for a novel 32-bit VLIW DSP processor (known as the PAC core), which is equipped with new architectural features such as distributed and ‘ping-pong’ register files. We also present methods for retargeting ORC compilers to PAC VLIW DSP processors. In addition, mechanisms are proposed to incorporate register allocation policies i...

LC-GRFA: global register file assignment with local consciousness for VLIW DSP processors with non-uniform register files

Embedded processors developed within the past few years have employed novel hardware designs to reduce the ever-growing complexity, power dissipation, and die area. Although a distributed register file architecture is considered to need fewer read/write ports than traditional unified register file structures, it presents challenges in compilation techniques to generate efficient code...

A Local-Conscious Global Register Allocator for VLIW DSP Processors with Distributed Register Files

Embedded processors developed in recent years have attempted to employ novel hardware designs to reduce ever-growing complexity, power dissipation, and die area. While a distributed register file architecture with irregular access constraints is considered a more effective approach than traditional unified register file structures, conventional compilation techniques are not ad...

LC-GRFA: global register file assignment with local consciousness for VLIW DSP processors with irregular register files

Embedded processors developed within the past few years have employed novel hardware designs to reduce the ever-growing complexity, power dissipation, and die area. While a distributed register file architecture with irregular access constraints is considered to need fewer read/write ports than traditional unified register file structures, conventional compilation techniques can n...

Publication date: 2009